NSF PAR Search | NSF Public Access Repository

NestedBD: Bayesian inference of phylogenetic trees from single-cell copy number profiles under a birth-death model

https://doi.org/10.1186/s13015-024-00264-4

Liu, Yushu; Edrisi, Mohammadamin; Yan, Zhi; Ogilvie, Huw; Nakhleh, Luay (December 2024, Algorithms for Molecular Biology)

Abstract: Copy number aberrations (CNAs) are ubiquitous in many types of cancer. Inferring CNAs from cancer genomic data could help shed light on the initiation, progression, and potential treatment of cancer. While such data have traditionally been available via “bulk sequencing,” the more recently introduced techniques for single-cell DNA sequencing (scDNAseq) provide the type of data that makes CNA inference possible at the single-cell resolution. We introduce a new birth-death evolutionary model of CNAs and a Bayesian method, NestedBD, for the inference of evolutionary trees (topologies and branch lengths with relative mutation rates) from single-cell data. We evaluated NestedBD’s performance using simulated data sets, benchmarking its accuracy against traditional phylogenetic tools as well as state-of-the-art methods. The results show that NestedBD infers more accurate topologies and branch lengths, and that the birth-death model can improve the accuracy of copy number estimation. When applied to biological data sets, NestedBD infers plausible evolutionary histories of two colorectal cancer samples. NestedBD is available at https://github.com/Androstane/NestedBD.
Accurate integration of single-cell DNA and RNA for analyzing intratumor heterogeneity using MaCroDNA

https://doi.org/10.1038/s41467-023-44014-3

Edrisi, Mohammadamin; Huang, Xiru; Ogilvie, Huw A.; Nakhleh, Luay (December 2023, Nature Communications)

Abstract: Cancers develop and progress as mutations accumulate, and with the advent of single-cell DNA and RNA sequencing, researchers can observe these mutations and their transcriptomic effects and predict proteomic changes with remarkable temporal and spatial precision. However, to connect genomic mutations with their transcriptomic and proteomic consequences, cells with either only DNA data or only RNA data must be mapped to a common domain. For this purpose, we present MaCroDNA, a method that uses maximum weighted bipartite matching of per-gene read counts from single-cell DNA and RNA-seq data. Using ground truth information from colorectal cancer data, we demonstrate the advantage of MaCroDNA over existing methods in accuracy and speed. Exemplifying the utility of single-cell data integration in cancer research, we suggest, based on results derived using MaCroDNA, that genomic mutations of large effect size increasingly contribute to differential expression between cells as Barrett’s esophagus progresses to esophageal cancer, reaffirming the findings of previous studies.
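The matching idea named in this abstract can be illustrated with a toy sketch. This is not the MaCroDNA implementation: the function names `pearson` and `match_cells`, the toy read counts, and the choice of Pearson correlation as the edge weight are all assumptions made for illustration, and an exhaustive search stands in for the polynomial-time assignment solver a real tool would use.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def match_cells(dna_counts, rna_counts):
    """Pair each RNA cell with a distinct DNA cell, maximizing total edge weight.

    Hypothetical helper: edge weights are Pearson correlations of per-gene
    counts (an assumption). Exhaustive search over all assignments, so this
    is only usable on tiny inputs; requires len(rna_counts) <= len(dna_counts).
    """
    weights = [[pearson(r, d) for d in dna_counts] for r in rna_counts]
    best_score, best_assignment = float("-inf"), None
    for perm in permutations(range(len(dna_counts)), len(rna_counts)):
        score = sum(weights[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_assignment = score, perm
    return list(best_assignment), best_score


# Toy per-gene counts (made up): rows are cells, columns are genes.
dna = [[2, 2, 4, 1], [1, 3, 2, 2], [3, 1, 1, 4]]
rna = [[10, 30, 20, 21], [31, 9, 12, 40]]

# assignment[i] is the index of the DNA cell paired with RNA cell i.
assignment, score = match_cells(dna, rna)
print(assignment, round(score, 3))
```

Each RNA cell is paired with a different DNA cell, which is what distinguishes a matching from independently picking the best-correlated partner for every cell. For realistic cell counts the exhaustive loop would need to be replaced by a proper assignment algorithm.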
Phylovar: toward scalable phylogeny-aware inference of single-nucleotide variations from single-cell DNA sequencing data

https://doi.org/10.1093/bioinformatics/btac254

Edrisi, Mohammadamin; Valecha, Monica V; Chowdary, Sunkara B V; Robledo, Sergio; Ogilvie, Huw A; Posada, David; Zafar, Hamim; Nakhleh, Luay (June 2022, Bioinformatics)

Abstract: Motivation: Single-nucleotide variants (SNVs) are the most common variations in the human genome. Recently developed methods for SNV detection from single-cell DNA sequencing data, such as SCIΦ and scVILP, leverage the evolutionary history of the cells to overcome the technical errors associated with single-cell sequencing protocols. Despite being accurate, these methods are not scalable to the extensive genomic breadth of single-cell whole-genome (scWGS) and whole-exome sequencing (scWES) data. Results: Here, we report on a new scalable method, Phylovar, which extends the phylogeny-guided variant calling approach to sequencing datasets containing millions of loci. Through benchmarking on simulated datasets under different settings, we show that Phylovar outperforms SCIΦ in terms of running time while being more accurate than Monovar (which is not phylogeny-aware) in terms of SNV detection. Furthermore, we applied Phylovar to two real biological datasets: an scWES triple-negative breast cancer dataset consisting of 32 cells and 3375 loci, and an scWGS dataset of neuron cells from a normal human brain containing 16 cells and approximately 2.5 million loci. For the cancer data, Phylovar detected somatic SNVs with high or moderate functional impact that were also supported by a bulk sequencing dataset, and for the neuron dataset, Phylovar identified 5745 SNVs with non-synonymous effects, some of which were associated with neurodegenerative diseases. Availability and implementation: Phylovar is implemented in Python and is publicly available at https://github.com/NakhlehLab/Phylovar.
Methods for copy number aberration detection from single-cell DNA-sequencing data

https://doi.org/10.1186/s13059-020-02119-8

Mallory, Xian F.; Edrisi, Mohammadamin; Navin, Nicholas; Nakhleh, Luay (December 2020, Genome Biology)
Assessing the performance of methods for copy number aberration detection from single-cell DNA sequencing data

https://doi.org/10.1371/journal.pcbi.1008012

Mallory, Xian F.; Edrisi, Mohammadamin; Navin, Nicholas; Nakhleh, Luay; Ioshikhes, Ilya (July 2020, PLOS Computational Biology)

Current progress and open challenges for applying deep learning across the biosciences

https://doi.org/10.1038/s41467-022-29268-7

Sapoval, Nicolae; Aghazadeh, Amirali; Nute, Michael G; Antunes, Dinler A; Balaji, Advait; Baraniuk, Richard; Barberan, C J; Dannenfelser, Ruth; Dun, Chen; Edrisi, Mohammadamin; et al. (April 2022, Nature Communications)

Abstract: Deep Learning (DL) has recently enabled unprecedented advances in one of the grand challenges in computational biology: the half-century-old problem of protein structure prediction. In this paper we discuss recent advances, limitations, and future perspectives of DL on five broad areas: protein structure prediction, protein function prediction, genome engineering, systems biology and data integration, and phylogenetic inference. We discuss each application area and cover the main bottlenecks of DL approaches, such as training data, problem scope, and the ability to leverage existing DL architectures in new contexts. To conclude, we provide a summary of the subject-specific and general challenges for DL across the biosciences.